@ironcladlou
Contributor

Refactor the integration test to:

  1. Uninstall the CVO and any existing Cluster Ingress Operator deployment
  2. Re-install the operator
  3. Do a smoke test against the default router
  4. Tear everything down

This is less than ideal long term and should probably be replaced by proper e2e
tests, but in the meantime it should provide a more uniform testing strategy as
we iterate.

@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 16, 2018
@ironcladlou
Contributor Author

Here's what success looks like.

First, pushing a new operator image from my local tree:

$ REPO=docker.io/ironcladlou/origin-cluster-ingress-operator make release-local                                                                                                                                 
MANIFESTS=/var/folders/x1/64frcn0d4_gb1028yj3fxxm80000gn/T/tmp.BQblS1kr hack/release-local.sh                                                                                                                   
[integration-test-refactor 5f9f55eb] Temporary
 7 files changed, 27 insertions(+), 30 deletions(-)
Sending build context to Docker daemon  214.4MB                                                                                                                                                                 
Step 1/11 : FROM openshift/origin-release:golang-1.10 as builder                                                                                                                                                
 ---> d9ac547ae5b0                                                                                                                                                                                              
Step 2/11 : COPY . /go/src/github.com/openshift/cluster-ingress-operator/                                                                                                                                       
 ---> 66eea6da4525                                                                                                                                                                                              
Step 3/11 : RUN cd /go/src/github.com/openshift/cluster-ingress-operator && make build                                                                                                                          
 ---> Running in bfc21786ad09                                                                                                                                                                                   
GOOS=linux go build -o cluster-ingress-operator github.com/openshift/cluster-ingress-operator/cmd/cluster-ingress-operator                                                                                      
Removing intermediate container bfc21786ad09                                                                                                                                                                    
 ---> c9e962f46e65                                                                                                                                                                                              
Step 4/11 : FROM centos:7
 ---> 5182e96772bf
Step 5/11 : LABEL io.openshift.release.operator true
 ---> Using cache                                                                                   
 ---> 421a00ff787c                                                                        
Step 6/11 : LABEL io.k8s.display-name="OpenShift cluster-ingress-operator"       io.k8s.description="This is a component of OpenShift Container Platform and manages the lifecycle of cluster ingress components."       maintainer="Dan Mace <dmace@redhat.com>"                                     
 ---> Using cache                                                                                                                                                                                               
 ---> c0db07595589                                                                                                                                                                                              
Step 7/11 : COPY --from=builder /go/src/github.com/openshift/cluster-ingress-operator/cluster-ingress-operator /usr/bin/                                                                                        
 ---> Using cache                                                                                                                                                                                               
 ---> a72e5eafbe0e                                                                                                                                                                                              
Step 8/11 : COPY manifests /manifests                                                                                                                                                                           
 ---> Using cache                                                                                                                                                                                               
 ---> c9c14919a4b2                                                                                                         
Step 9/11 : RUN useradd cluster-ingress-operator                                                                  
 ---> Using cache                                                                                                                                                                                               
 ---> e47bbb18cf19       
Step 10/11 : USER cluster-ingress-operator                                                                               
 ---> Using cache                                                                                                                       
 ---> a355f07e6f4f                                                                                                                                                                                              
Step 11/11 : ENTRYPOINT ["/usr/bin/cluster-ingress-operator"]                                                                                            
 ---> Using cache
 ---> fff3773b78b4
Successfully built fff3773b78b4
Successfully tagged ironcladlou/origin-cluster-ingress-operator:5f9f55eb
The push refers to repository [docker.io/ironcladlou/origin-cluster-ingress-operator]
48a289cabcdf: Layer already exists
c82dd64f0350: Layer already exists
dd14fc350498: Layer already exists
1d31b5806ba4: Layer already exists
5f9f55eb: digest: sha256:b69317342d5387047b39ef2609a7df58046e9296a2fdd5ce980dd9636b128548 size: 1156
Pushed docker.io/ironcladlou/origin-cluster-ingress-operator:5f9f55eb
Install manifests using:

oc apply -f /var/folders/x1/64frcn0d4_gb1028yj3fxxm80000gn/T/tmp.BQblS1kr

Then, running the test using the newly pushed image:

$ CLUSTER_NAME=dmace MANIFESTS=/var/folders/x1/64frcn0d4_gb1028yj3fxxm80000gn/T/tmp.BQblS1kr make test-integration                                                                                              
hack/test-integration.sh
=== RUN   TestIntegration
time="2018-10-16T17:40:33-04:00" level=info msg="cmd output: daemonset.extensions/cluster-version-operator not patched\n"                                                                                       
time="2018-10-16T17:40:34-04:00" level=info msg="cmd output: Error from server (NotFound): namespaces \"openshift-ingress\" not found\n"                                                                        
time="2018-10-16T17:40:35-04:00" level=info msg="cmd output: error: the server doesn't have a resource type \"clusteringresses\"\n"                                                                             
time="2018-10-16T17:40:35-04:00" level=info msg="cmd output: Error from server (NotFound): namespaces \"openshift-cluster-ingress-operator\" not found\n"                                                       
time="2018-10-16T17:40:36-04:00" level=info msg="cmd output: Error from server (NotFound): namespaces \"openshift-cluster-ingress-router\" not found\n"                                                         
time="2018-10-16T17:40:36-04:00" level=info msg="cmd output: Error from server (NotFound): clusterroles.rbac.authorization.k8s.io \"cluster-ingress-operator:operator\" not found\n"                            
time="2018-10-16T17:40:36-04:00" level=info msg="cmd output: Error from server (NotFound): clusterroles.rbac.authorization.k8s.io \"cluster-ingress:router\" not found\n"                                       
time="2018-10-16T17:40:36-04:00" level=info msg="cmd output: Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io \"cluster-ingress-operator:operator\" not found\n"                     
time="2018-10-16T17:40:37-04:00" level=info msg="cmd output: Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io \"cluster-ingress:router\" not found\n"                                
time="2018-10-16T17:40:38-04:00" level=info msg="cmd output: Error from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \"clusteringresses.ingress.openshift.io\" not found\n"                
time="2018-10-16T17:40:39-04:00" level=info msg="cmd output: clusterrole.rbac.authorization.k8s.io/cluster-ingress-operator:operator created\ncustomresourcedefinition.apiextensions.k8s.io/clusteringresses.ingress.openshift.io created\nnamespace/openshift-cluster-ingress-operator created\nclusterrolebinding.rbac.authorization.k8s.io/cluster-ingress-operator:operator created\nrolebinding.rbac.authorization.k8s.io/cluster-ingress-operator created\nrole.rbac.authorization.k8s.io/cluster-ingress-operator created\nserviceaccount/cluster-ingress-operator created\ndeployment.apps/cluster-ingress-operator created\n"            
=== RUN   TestIntegration/TestDefaultIngress
time="2018-10-16T17:40:39-04:00" level=info msg="testing in namespace ingress-test-qpwzdq"
time="2018-10-16T17:43:34-04:00" level=info msg="cmd output: warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.\nclusteringress.ingress.openshift.io \"default\" force deleted\n"
time="2018-10-16T17:44:10-04:00" level=info msg="cmd output: namespace \"openshift-cluster-ingress-operator\" deleted\n"                                                                                        
time="2018-10-16T17:44:30-04:00" level=info msg="cmd output: namespace \"openshift-cluster-ingress-router\" deleted\n"                                                                                          
time="2018-10-16T17:44:30-04:00" level=info msg="cmd output: clusterrole.rbac.authorization.k8s.io \"cluster-ingress-operator:operator\" deleted\n"                                                             
time="2018-10-16T17:44:30-04:00" level=info msg="cmd output: clusterrole.rbac.authorization.k8s.io \"cluster-ingress:router\" deleted\n"                                                                        
time="2018-10-16T17:44:31-04:00" level=info msg="cmd output: clusterrolebinding.rbac.authorization.k8s.io \"cluster-ingress-operator:operator\" deleted\n"                                                      
time="2018-10-16T17:44:32-04:00" level=info msg="cmd output: clusterrolebinding.rbac.authorization.k8s.io \"cluster-ingress:router\" deleted\n"                                                                 
time="2018-10-16T17:44:33-04:00" level=info msg="cmd output: customresourcedefinition.apiextensions.k8s.io \"clusteringresses.ingress.openshift.io\" deleted\n"                                                 
--- PASS: TestIntegration (240.90s)
    --- PASS: TestIntegration/TestDefaultIngress (172.00s)
        test_ingress.go:105: service openshift-cluster-ingress-router/router-default has ingress.Hostname a270d52ddd18c11e89d3c0a1c7b32d5f-1772344121.us-east-1.elb.amazonaws.com                               
PASS
ok      github.com/openshift/cluster-ingress-operator/test/integration  240.936s

@ironcladlou
Contributor Author

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 16, 2018
sdk.Watch(resource, kind, tc.operatorNamespace, resyncPeriod)
sdk.Handle(stub.NewHandler())
go sdk.Run(context.TODO())
func (tc *TestConfig) uninstallOperator(t *testing.T) {
Contributor Author

Should all this teardown stuff go in a script so it can be called outside the code? That's what I do for manual testing anyway

@@ -0,0 +1,32 @@
#/bin/bash
Contributor

Missing a !.

Contributor Author

Fixed

t.Fatalf("KUBECONFIG is required")
}
// The operator-sdk uses KUBERNETES_CONFIG...
os.Setenv("KUBERNETES_CONFIG", kubeConfig)
Contributor

Is this os.Setenv still needed now that test-integration.sh sets KUBERNETES_CONFIG?

Contributor Author

Gone

echo "Pushed $REPO:$REV"
echo "Install manifests using:"
echo ""
echo "oc apply -f $MANIFESTS"
Contributor

Missing -R for recursion.

@pravisankar

I think -f does the recursion.
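Whether -R matters here depends on the layout of the manifests directory: `oc apply -f DIR` reads the manifest files directly inside DIR, and `-R` (`--recursive`) is what extends that to subdirectories. A minimal sketch for checking a directory before deciding, assuming a hypothetical helper named `nested_manifests` and made-up file names in the demo:

```shell
#!/bin/bash
# nested_manifests DIR (hypothetical helper): print manifest files that sit
# below DIR's top level, i.e. files a non-recursive `oc apply -f DIR` would skip.
nested_manifests() {
  local dir="$1"
  find "$dir" -mindepth 2 -type f \
    \( -name '*.yaml' -o -name '*.yml' -o -name '*.json' \) | sort
}

# Demo on a throwaway directory so nothing touches a real cluster.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/rbac"
touch "$demo_dir/00-namespace.yaml" "$demo_dir/rbac/01-role.yaml"

if [ -n "$(nested_manifests "$demo_dir")" ]; then
  echo "nested manifests found; apply with: oc apply -f <dir> -R"
else
  echo "flat directory; oc apply -f <dir> is enough"
fi
rm -rf "$demo_dir"
```

If the generated manifests directory is flat, both reviewers' positions lead to the same result and the plain `oc apply -f $MANIFESTS` works as printed.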


return config
// reinstall the operator
tc.runShellCmdNonFatal(t, fmt.Sprintf(`oc apply -f %s`, tc.manifestsDir))
Contributor

Missing -R for recursion.

@ironcladlou
Contributor Author

I want to try a commit that extracts a lot of this into steps that can be composed

},
ObjectMeta: metav1.ObjectMeta{
Name: "router-default",
Namespace: "openshift-cluster-ingress-router",
Contributor

Shouldn't the namespace be namespace?

Contributor Author

Clarified this in a comment

Contributor

Is that the // Hard-coded reference to the default router. comment? I guess I didn't realize the test was using a router deployed outside the test. Seems a little strange that the operator's "TestDefaultIngress" integration test doesn't use the operator to set up a default ingress—is the plan eventually to do that?

func NewTestConfig(t *testing.T) *TestConfig {
config := &TestConfig{t: t}

func NewTestConfig(t *testing.T, clusterName string, manifestsDir string) *TestConfig {
Contributor

It's a little confusing to use the same names for the parameters as you used for the global variables for the flags.

Contributor Author

Redid all this

tc.runShellCmdNonFatal(t, `oc delete clusterroles/cluster-ingress:router`)
tc.runShellCmdNonFatal(t, `oc delete clusterrolebindings/cluster-ingress-operator:operator`)
tc.runShellCmdNonFatal(t, `oc delete clusterrolebindings/cluster-ingress:router`)
tc.runShellCmdNonFatal(t, `oc delete customresourcedefinition.apiextensions.k8s.io/clusteringresses.ingress.openshift.io`)
Contributor

@Miciah Miciah Oct 16, 2018

Could you replace all the individual deletes with tc.runShellCmdNonFatal(t, fmt.Sprintf("oc delete -f %s -R", tc.manifestsDir))?

Contributor Author

Extracted all this into a script

Contributor Author

Not sure about using -R with oc delete in this case; I intentionally ordered the delete to minimize contention (e.g. tearing down the operator before routers to avoid fighting with the operator which wants to keep creating routers). Although with --force --grace-period 0 on the namespaces themselves I'm not entirely sure it matters. Worth doing some experiments.

Contributor Author

If it's not a big deal I'd ask we keep what I have in the script for now (which works) and optimize it later if possible.
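The ordering rationale in this thread (operator first, then routers, then cluster-scoped resources) can be sketched as follows. This is not the script from the PR: it is an illustration built only from resource names and flags that appear in this conversation, and the `run` wrapper, the `DRY_RUN` variable, and the function name are assumptions added so the sketch can be exercised without a cluster.

```shell
#!/bin/bash
# DRY_RUN=1 (the default in this sketch) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# run CMD... (hypothetical wrapper): failures are tolerated because resources
# may already be gone when the uninstaller runs.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@" || true
  fi
}

uninstall_ingress_operator() {
  # Operator namespace first, so nothing is left running to recreate routers.
  run oc delete namespaces/openshift-cluster-ingress-operator --force --grace-period 0
  run oc delete namespaces/openshift-cluster-ingress-router --force --grace-period 0
  # Cluster-scoped leftovers last.
  run oc delete clusterroles/cluster-ingress-operator:operator
  run oc delete clusterroles/cluster-ingress:router
  run oc delete clusterrolebindings/cluster-ingress-operator:operator
  run oc delete clusterrolebindings/cluster-ingress:router
  run oc delete customresourcedefinition.apiextensions.k8s.io/clusteringresses.ingress.openshift.io
}

uninstall_ingress_operator
```

Running it with `DRY_RUN=0` would execute the deletes for real. An explicit sequence like this preserves the ordering, which a single `oc delete -f <dir> -R` would not guarantee.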

Separate the component parts of the integration tests to facilitate manual
testing and to simplify the testing code.
@openshift-ci-robot openshift-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 16, 2018
@ironcladlou
Contributor Author

Okay, I redid a lot of this in an attempt to make the whole thing simpler and also more flexible. With a few additional steps on the part of the user, it should be possible to make use of all this stuff for both verifying our work outside CI and also iterative development through ad-hoc use of each piece (image builds, test, uninstaller). PTAL!

metadata:
  name: cluster-ingress-test
  labels:
    openshift.io/cluster-ingress-operator-test: ""
Contributor Author

Reserved for future use: should make it easier to identify test namespaces for cleanup.

Contributor

@imcsk8 imcsk8 left a comment

Looks good to me


if [ -z "${CLUSTER_NAME}" ]; then echo "CLUSTER_NAME is required"; exit 1; fi
if [ -z "${MANIFESTS}" ]; then echo "MANIFESTS is required"; exit 1; fi
if [ -z "${KUBECONFIG}" ]; then echo "KUBECONFIG is required"; exit 1; fi
Contributor

Should use ${KUBECONFIG:-} here or KUBECONFIG="${KUBECONFIG:-}" above, or else you will get "bash: KUBECONFIG: unbound variable" instead of the desired error message.
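A minimal sketch of the suggestion, assuming the script runs with `set -u` (which the "unbound variable" error implies). The `require_var` helper and the `DEMO_KUBECONFIG` variable are hypothetical names used only for illustration:

```shell
#!/bin/bash
set -u   # unset variables are errors, as the review comment implies

# require_var NAME (hypothetical helper): fail with a readable message when
# NAME is unset or empty. The ":-" default keeps `set -u` from aborting first.
require_var() {
  local name="$1"
  if [ -z "${!name:-}" ]; then
    echo "${name} is required" >&2
    return 1
  fi
}

# DEMO_KUBECONFIG stands in for KUBECONFIG so the demo leaves the real one alone.
unset DEMO_KUBECONFIG
require_var DEMO_KUBECONFIG || echo "caught the missing variable"
DEMO_KUBECONFIG=/path/to/kubeconfig
require_var DEMO_KUBECONFIG && echo "DEMO_KUBECONFIG is set"
```

In a real script the call would typically be `require_var KUBECONFIG || exit 1`, repeated for CLUSTER_NAME and MANIFESTS.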

Contributor Author

Fixed

oc delete clusterroles/cluster-ingress:router || true
oc delete clusterrolebindings/cluster-ingress-operator:operator || true
oc delete clusterrolebindings/cluster-ingress:router || true
oc delete customresourcedefinition.apiextensions.k8s.io/clusteringresses.ingress.openshift.io || true
Contributor

Instead of ending every command with || true, you could just omit the set -e.
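The difference between the two approaches can be seen in a small experiment. This is only an illustration, with `false` standing in for an `oc delete` of a resource that is already gone:

```shell
#!/bin/bash
# Each case runs in its own bash process so the shell options cannot leak.
# `false` stands in for an `oc delete` of a resource that is already gone.

# set -e, no guard: the script stops at the failure, so "reached" is never printed.
strict_output="$(bash -c 'set -e; false; echo reached')"

# set -e with a guard: the pattern the review comment calls out as repetitive.
guarded_output="$(bash -c 'set -e; false || true; echo reached')"

# No set -e: the failure is ignored without any guard.
lenient_output="$(bash -c 'false; echo reached')"

echo "set -e, unguarded: '${strict_output}'"
echo "set -e, || true:   '${guarded_output}'"
echo "no set -e:         '${lenient_output}'"
```

The trade-off is that dropping `set -e` makes every failure non-fatal, including ones that are not expected, so it suits a best-effort uninstaller better than a script whose steps depend on each other.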

Contributor Author

Fixed

# Uninstall tectonic ingress
oc delete namespaces/openshift-ingress || true

# Uninstall cluster-dns-operator
Contributor

s/dns/ingress/

Contributor Author

Fixed

@ironcladlou
Contributor Author

@Miciah

Is that the // Hard-coded reference to the default router. comment? I guess I didn't realize the test was using a router deployed outside the test. Seems a little strange that the operator's "TestDefaultIngress" integration test doesn't use the operator to set up a default ingress—is the plan eventually to do that?

Yes, the test executes against a cluster where the operator is assumed to already be running. For now at least, this is the only way we can consistently ensure our changes actually work inside a 4.0 cluster, including exercising RBAC, admission control within an operator namespace, etc.

This is probably technically now more of an e2e test, but some sort of consistent tooling for iterating like this was desperately needed. If anything, I'd imagine the tooling will continue to be of greater use than the test itself as the e2e tests are developed (e.g. you could use the uninstall/release image tooling to set up a cluster for iterating on e2e tests that live in origin and which execute with CI).

I don't see a lot of useful middle ground between unit and e2e tests at the moment; we had too many problems that presented themselves only when running in-cluster.

@ironcladlou
Contributor Author

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 17, 2018
@Miciah
Contributor

Miciah commented Oct 17, 2018

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 17, 2018
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ironcladlou, Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 2c61a20 into openshift:master Oct 17, 2018

@pravisankar pravisankar left a comment

LGTM
